Validation of risk prediction for outcomes of severe community-acquired pneumonia among under-five children in Amhara region, Northwest Ethiopia

Background: Globally, there are over 1,400 cases of pneumonia per 100,000 children every year, with children in South Asia and Sub-Saharan Africa disproportionately affected. Some cases have a poor treatment outcome (treatment failure, antibiotic change, prolonged hospital stay, or death), while others have a good outcome. Although clinical decision-making is a key aspect of the interventions, there are few tools such as risk scores to assist clinical judgment in low-income settings. This study aimed to validate a prediction model and develop risk scores for poor outcomes of severe community-acquired pneumonia (SCAP).
Methods: A cohort study was conducted among 539 under-five children hospitalized with SCAP. Data analysis was done using R version 4.0.5. A multivariable analysis was done. We developed a simplified risk score to facilitate clinical utility. Model performance was evaluated using the area under the receiver operating characteristic curve (AUC) and a calibration plot. Bootstrapping was used to validate all accuracy measures. A decision curve analysis was used to evaluate the clinical and public health utility of our model.
Results: The incidence of poor outcomes of pneumonia was 151 (28%) (95% CI: 24.2%-31.8%). Vaccination status, fever, pallor, unable to breastfeed, impaired consciousness, abnormal CBC, ICU admission, and vomiting remained in the reduced model. The AUC of the original model was 0.927 (95% CI: 0.90-0.96), whereas the risk score model had an AUC of 0.89 (95% CI: 0.853-0.922). In the decision curve analysis, the model provided a higher net benefit across the range of threshold probabilities.
Conclusions: Our model has excellent discrimination and calibration performance. Similarly, the risk score model has excellent discrimination and calibration ability, with only a small loss of accuracy compared with the original model. The models have the potential to improve care and treatment outcomes in clinical settings.


Methods and materials
An institution-based prospective cohort study was conducted among children less than 5 years of age hospitalized with severe community-acquired pneumonia in Amhara Regional State, Northwest Ethiopia. The study was conducted from February to May 2020 in three public referral hospitals: Felege-Hiwot Comprehensive Specialized Hospital (FHCSH), Tibebe Ghion Specialized Hospital (TGSH), and Debre-Markos Comprehensive Specialized Hospital (DMCSH). FHCSH and TGSH are located in Bahir Dar city, the capital of Amhara Regional State, 565 km from Addis Ababa, the capital city of Ethiopia. Both are governmental hospitals that provide inpatient services for children in the pediatric ward and have an emergency outpatient department (EOPD) that can admit patients for 24-72 hours. DMCSH is a referral hospital in Debre Markos city, the capital of East Gojam Zone, located 185 km from Addis Ababa. It is likewise a governmental hospital providing both inpatient and outpatient health care services in the region.
The study domain was children aged 2-59 months who were diagnosed with severe community-acquired pneumonia based on the World Health Organization (WHO) diagnostic criteria of Pneumonia Severity Index (PSI) and admitted to the pediatric emergency outpatient department or pediatric wards of the selected hospitals. In these hospitals, for the purposes of intervention and classification of pneumonia, SCAP is defined as chest indrawing with or without fast breathing, or any general danger sign such as inability to breastfeed or eat, vomiting everything, grunting, lower chest indrawing, or altered mental status [13,15]. Children diagnosed with SCAP who died before the start of medication, and children diagnosed with SCAP at admission whose diagnosis changed at any time during follow-up, were excluded from the study.
The sample size was determined based on the minimum standard of 10 events per candidate predictor, using the formula N = (n × 10)/I, where N is the sample size, n is the number of candidate predictor variables, and I is the estimated event rate in the population [20]. We had 18 potential prognostic determinants for severe community-acquired pneumonia, and the prevalence of one of the components of poor outcomes of severe community-acquired pneumonia was 35% [21]. The calculated sample size was therefore 514, and after adding 10% for anticipated refusal to participate, the total sample size required was 565 children. As a sampling procedure, we allocated the sample size to the three hospitals based on their previous monthly admission rates taken from DHIS reports. Accordingly, 203 cases from FHCSH, 147 cases from TGSH, and 215 cases from DMCSH were selected consecutively.
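The arithmetic behind these figures can be reproduced with a few lines of Python. This is only an illustrative check of the formula N = (n × 10)/I using the values reported above; the function name and the rounding to the nearest whole child are assumptions, not part of the study's own analysis code.

```python
def events_per_variable_sample_size(n_predictors, event_rate, events_per_predictor=10):
    """Return N = (n x events_per_predictor) / I, rounded to the nearest whole number."""
    return round(n_predictors * events_per_predictor / event_rate)


# Values reported in the text: 18 candidate predictors, 35% event rate.
base_n = events_per_variable_sample_size(18, 0.35)  # 180 / 0.35 = 514.29 -> 514

# Add 10% for anticipated refusal (rounding rule is an assumption).
total_n = round(base_n * 1.10)  # 565.4 -> 565

# Allocation to the three hospitals as reported in the text.
allocation = {"FHCSH": 203, "TGSH": 147, "DMCSH": 215}

print(base_n, total_n, sum(allocation.values()))
```

The hospital allocations sum to the total required sample size.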

Outcome measurement
A poor outcome of severe community-acquired pneumonia was considered to have occurred if treatment failure, antibiotic change, prolonged hospital stay, or death occurred during the follow-up days; otherwise, the outcome was classified as good.

Operational definitions for some terminologies
Treatment failure: persistence of features of severe pneumonia after initiation of antimicrobial therapy, or a worsening clinical condition within 48-72 hours of the commencement of antibiotics [22,23].
Antibiotic change: a shift from one drug to another (first line to second line) after taking at least two doses.
Longer stay in the hospital: a hospital stay of more than five days [7].
Death: death of a child during the hospitalization days, confirmed by health professionals.
Complete blood count (CBC): a composite variable. Laboratory findings showing an increased level of platelets, white blood cells, or neutrophils, or the presence of anemia, were categorized as abnormal; otherwise, the finding was categorized as normal.

Prognostic determinants
The prognostic determinants considered were sociodemographic factors (age, sex, immunization status, delayed presentation (> 5 days), exclusive breastfeeding, and complementary feeding), clinical features at presentation (breathlessness, fever, pallor, cyanosis, grunting, and vomiting), co-morbidities (human immunodeficiency virus (HIV), diarrhea in the last 2 weeks, cough in the last 2 weeks, stunting, and wasting), and environmental conditions (cooking place, fuels used for cooking, smokers in the house, and family size).
A structured questionnaire was developed based on the available literature [21,23-26], and medical charts were used to collect patient data in the hospitals. Six data collectors and two supervisors were selected. Senior health officers and nurses trained in Integrated Management of Neonate and Child Illness (IMNCI) were involved in the data collection. One day of training was provided to the data collectors and supervisors on the objectives of the study; the steps and approaches for interviewing; how to take accurate measurements of weight and height/length and calculate height for age (H/A) and weight for height (W/H); and how to take vital signs and perform physical examinations correctly. The data were collected in phases. The first data were taken on the admission date as baseline information. The second were taken on the third day and concerned the clinical prognosis and antibiotics taken, that is, whether the clinical status of the patient was stable, the same, or worse. The final data were collected at discharge. If the child had not been discharged, the final data were collected on the sixth to the eighth day. The date of discharge, discharge condition (improved, same, not improved, or death), antibiotics taken, treatment failure, and other important information were registered. If the final diagnosis was other than severe community-acquired pneumonia, the case was excluded.

Data processing and analysis
Data were collected using EPI DATA, version 3.02, and exported to the R statistical programming language, version 4.0.5, for analysis. Descriptive statistics such as the mean, standard deviation (SD), median, interquartile range (IQR), and percentages were computed and presented in tables.
Bivariate logistic regression analysis was performed to show unadjusted associations between each prognostic factor and poor outcomes of severe community-acquired pneumonia and to select candidate prognostic determinants for multivariate prediction modeling. Variables with a p-value less than 0.25 in the bivariate regression were retained for multivariate analysis. We tested for multicollinearity among the determinants using the variance inflation factor (VIF), with VIF > 10 as the cut-off point for excluding a variable from multivariate modeling [27].
A backward stepwise logistic regression analysis was performed to arrive at the final reduced model. The regression coefficients with their 95% confidence intervals and p-values were reported for the models. Model accuracy was assessed using the area under the ROC curve (discrimination) and a calibration plot (calibration). For interpreting the discrimination ability of a model, an AUC of 0.5 is worthless, an AUC of 0.80-0.90 indicates good performance, and an AUC of 0.90-1.0 indicates excellent performance [28,29].
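To make the discrimination measure concrete, the sketch below computes the AUC as the proportion of event/non-event pairs in which the event case has the higher predicted risk (ties counted as one half). It is a minimal illustration on made-up data, not the study's R code; the function names, the toy values, and the handling of the 0.90 boundary in the interpretation bands are assumptions.

```python
def auc(scores, labels):
    """AUC as the share of (event, non-event) pairs ranked correctly; ties count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def interpret_auc(value):
    """Bands quoted in the text; 0.90 is assigned to 'excellent' by assumption."""
    if value >= 0.90:
        return "excellent"
    if value >= 0.80:
        return "good"
    if value == 0.5:
        return "worthless"
    return "not classified in the text"


# Hypothetical predicted risks and observed outcomes (1 = poor outcome).
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
toy_labels = [1, 1, 0, 1, 0, 1, 0, 0]

toy_auc = auc(toy_scores, toy_labels)
print(toy_auc, interpret_auc(toy_auc))
```

In the toy data, 13 of the 16 event/non-event pairs are ranked correctly, so the AUC is 13/16.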
To internally validate our model, a bootstrapping technique was used. We drew 1000 random bootstrap samples with replacement. As for the original model, the regression coefficients and AUC of the validated model were reported and compared with those of the original model. We used a decision curve analysis (DCA) to evaluate the clinical and public health impact of the model in terms of standardized net benefit across a range of threshold probabilities (0 to 1) [30].
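For readers unfamiliar with decision curve analysis, the sketch below shows the standard net benefit calculation at a single threshold probability: net benefit = TP/n - (FP/n) × pt/(1 - pt), compared with the "treat all" strategy (the "treat none" strategy has a net benefit of 0). The data and function names are hypothetical, and this is not the study's own code; dividing the result by the outcome prevalence would give the standardized net benefit.

```python
def net_benefit(probs, labels, threshold):
    """Net benefit of treating patients whose predicted risk is >= threshold.

    The threshold must be strictly below 1, otherwise the odds term is undefined.
    """
    n = len(labels)
    tp = sum(1 for p, y in zip(probs, labels) if p >= threshold and y == 1)
    fp = sum(1 for p, y in zip(probs, labels) if p >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1 - threshold)


def net_benefit_treat_all(labels, threshold):
    """Net benefit of treating every patient regardless of predicted risk."""
    prevalence = sum(labels) / len(labels)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Hypothetical predicted risks and observed outcomes (1 = poor outcome).
toy_probs = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
toy_labels = [1, 1, 0, 1, 0, 1, 0, 0]

nb_model = net_benefit(toy_probs, toy_labels, 0.5)
nb_all = net_benefit_treat_all(toy_labels, 0.5)
print(nb_model, nb_all)
```

A decision curve repeats this calculation over a grid of thresholds and plots the model against the two default strategies.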
To develop a simplified and easily applicable prediction score for the outcomes of severe community-acquired pneumonia, each regression coefficient in the validated model was divided by the smallest coefficient and rounded to the nearest integer. We determined the total score for each individual by assigning the points for each variable present and adding them up. For simple interpretation in a clinical setting, we dichotomized the total risk score at the optimal cut-off point given by the Youden index, and patients were categorized into high-risk or low-risk groups based on their total risk scores. Because simplifying a risk score can cause a loss of information, and therefore some loss of prognostic accuracy, we fitted a model for the risk score to compare its accuracy with that of the original model. An ROC curve was plotted, and the AUC with its 95% confidence interval was computed to evaluate the discriminatory power of the scoring system.
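The scoring steps described above can be sketched as follows. The coefficients, predictor names, and patients are entirely hypothetical (they are not the study's estimates), and the convention that a score at or above the cut-off counts as high risk is an assumption; the sketch only illustrates dividing by the smallest coefficient, rounding, summing points, and choosing a cut-off by the Youden index (sensitivity + specificity - 1).

```python
def points_from_coefficients(coefs):
    """Divide each (positive) coefficient by the smallest one and round to an integer."""
    smallest = min(coefs.values())
    return {name: round(beta / smallest) for name, beta in coefs.items()}


def total_score(present, points):
    """Sum the points of the predictors present in one patient."""
    return sum(points[name] for name in present)


def youden_cutoff(scores, labels):
    """Return the cut-off (score >= cut-off = high risk) maximizing the Youden index."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    best_cut, best_j = None, -1.0
    for cut in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s >= cut and y == 1)
        tn = sum(1 for s, y in zip(scores, labels) if s < cut and y == 0)
        j = tp / n_pos + tn / n_neg - 1
        if j > best_j:
            best_cut, best_j = cut, j
    return best_cut, best_j


# Hypothetical regression coefficients (log odds ratios).
toy_coefs = {"fever": 0.8, "pallor": 1.7, "vomiting": 1.3, "icu": 2.5}
toy_points = points_from_coefficients(toy_coefs)

# Hypothetical patients: (predictors present, outcome with 1 = poor outcome).
toy_patients = [
    (set(), 0), ({"fever"}, 0), ({"pallor"}, 0), ({"pallor", "icu"}, 0),
    ({"icu"}, 1), ({"vomiting", "icu"}, 1), ({"pallor", "icu"}, 1),
    ({"fever", "pallor", "vomiting", "icu"}, 1),
]
toy_scores = [total_score(present, toy_points) for present, _ in toy_patients]
toy_labels = [outcome for _, outcome in toy_patients]

cutoff, j_index = youden_cutoff(toy_scores, toy_labels)
print(toy_points, toy_scores, cutoff, j_index)
```

When two cut-offs tie on the Youden index, this sketch keeps the lower one; other tie-breaking rules are equally defensible.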

Ethics approval and consent to participate
Ethical clearance was obtained from the Ethical Review Committee of Bahir Dar University. Letters requesting cooperation were written from Bahir Dar University to each study hospital. Permission letters were obtained from each hospital's administrative office to conduct the study on site. The study participant information sheet was attached to the front page of each questionnaire; before data collection began, the caregiver or family of each patient was fully informed and asked to participate. Verbal informed consent was obtained from the caregiver or family of the patient after the purpose of the study was explained.

Baseline demographic and clinical characteristics of under-five with pneumonia
The dataset had no missing data. We aimed to include 565 children; 539 children were included in this study, giving a response rate of 95%. Of these, 270 (50.1%) were female. The median age of the children was 17 months (IQR: ), and 200 (37.1%) were aged 2 to 11 months. One hundred forty-two (26.3%) and 65 (12.1%) children had cyanosis and pallor, respectively, upon admission. In addition, 239 (44.3%) had vomiting, 243 (45.1%) were unable to breastfeed, 138 (25.6%) had impaired consciousness, 107 (19.9%) were wasted, 101 (18.7%) were stunted, and 83 (15.4%) were admitted to the ICU. The mean body temperature was 37.6 (±1.0 SD), the median HCT was 33.2 (IQR: 37.6-33.2), and the median respiratory rate was 56 (IQR: 62-48) (Table 1).

The incidence of poor outcomes and prognostic determinants of pneumonia among under-five children
The findings of this study revealed that treatment failure, switching treatment, longer hospital stay, and death occurred in 51 (33.6%), 34 (22.7%), 61 (40.7%), and 5 (3%) children, respectively. The overall incidence of poor outcomes of pneumonia was 151 (28%) (95% CI: 24.2%-31.8%). We conducted a bivariate logistic regression analysis to identify prognostic determinants of poor outcomes of pneumonia. Several prognostic determinants were entered in the bivariate modeling to select candidate predictors for the multivariate model. Accordingly, 14 candidate predictors were eligible for further regression analysis (Table 2).
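The reported confidence interval is consistent with a normal-approximation (Wald) interval for a proportion, as the short check below shows. The text does not state which interval method was used, so the Wald formula here is an assumption, and the function name is illustrative.

```python
import math


def wald_ci(events, n, z=1.96):
    """Proportion with a normal-approximation (Wald) 95% confidence interval."""
    p = events / n
    se = math.sqrt(p * (1 - p) / n)
    return p, p - z * se, p + z * se


# Counts reported in the text: 151 poor outcomes among 539 children.
incidence, ci_low, ci_high = wald_ci(151, 539)
print(round(incidence * 100, 1), round(ci_low * 100, 1), round(ci_high * 100, 1))
```

Rounded to one decimal place, this gives 28.0% with limits of 24.2% and 31.8%.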

A predictive model for poor outcomes of severe community-acquired pneumonia in children
We entered the 14 candidate prognostic determinants from the bivariate analysis into the original multivariate model. Vaccination status, fever, pallor, unable to breastfeed, impaired consciousness, abnormal CBC, ICU admission, and vomiting remained in the reduced model. We then developed a prediction model and risk scores based on the results of the reduced model (Table 3).
The area under the receiver operating characteristic curve (AUC) of the original model was 0.926 (95% confidence interval 0.897 to 0.952) (Fig 1A). The model fitness test had a p-value of 0.176, and the calibration curve lies close to the 45-degree line, indicating close agreement between the predicted and observed probabilities (Fig 1B). To avoid over-interpretation and overly optimistic results from the original model, we validated the model with a bootstrapping technique using the rms package. We used 1000 bootstrap samples with replacement; the corrected AUC was 0.913 (95% CI: 0.899, 0.956), and the optimism coefficient for the validated model was 0.0135. The β coefficients from the bootstrapped model were nearly the same as the original β coefficients. The calibration plot for the validated model (Fig 2) indicates very good agreement between predicted and observed probabilities; the apparent curve only very slightly outperforms the bias-corrected curve between predicted probabilities of 0.2 and 0.7. Therefore, given the limited optimism and excellent calibration, the model might perform well in a new sample.
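As a worked check, the corrected AUC follows from subtracting the bootstrap optimism from the apparent AUC reported in this section. The notation below is introduced here for illustration only.

```latex
\mathrm{AUC}_{\text{corrected}}
  = \mathrm{AUC}_{\text{apparent}} - \text{optimism}
  = 0.926 - 0.0135
  = 0.9125 \approx 0.913
```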
Regarding the decision to use our model, the decision curve for the model outperforms the default strategies (referring all and referring none) across the entire range of threshold probabilities. This suggests that using the model offers greater clinical and public health value than either default strategy. Therefore, decisions made using the model, such as safely discharging children with some medications or keeping children in hospital for more intensive care, have a higher net benefit (Fig 3).

Risk classification using a simplified risk score
For practical utility, we developed a simplified risk score from the validated model. The risk score had an AUC of 0.89 (95% CI: 0.853-0.922), which is nearly comparable to the prediction accuracy of the original model (the reddish-purple curve in Fig 4). This means that the probability that a randomly selected child with a poor outcome of pneumonia receives a higher risk score than a randomly selected child without a poor outcome is 89%. A child can have a minimum total risk score of 0 and a maximum of 27. For simple interpretation in clinical settings, we categorized risk scores into less than nine points (< 9, low-risk group) and nine points or more (≥ 9, high-risk group) based on the Youden index (optimal cut-off point), which corresponds to a probability of 0.333 in the model. Out of the total 151 (28%) cases of poor outcomes of SCAP, 47 (11.5%) were in the low-risk group and 104 (81%) were in the high-risk group (Table 4).

Discussions
To halt the loss of millions of young lives from preventable causes including pneumonia, the World Health Organization and UNICEF set an integrated Global Action Plan for the Prevention and Control of Pneumonia and Diarrhea (GAPPD). The plan targeted fewer than three child pneumonia deaths per 1,000 live births by 2025. It brings together critical services and interventions to create healthy environments and encourages activities that save children from diseases, in order to ensure that every child has access to recognized and appropriate preventive and clinical interventions [31]. Similarly, the Sustainable Development Goals (SDG) targeted fewer than 25 under-five child deaths per 1,000 live births by 2030 [32]. To meet these targets, accurate risk-stratification tools and models to guide clinical decision-making in health care settings are essential. Therefore, we developed a prediction model and risk scores for poor outcomes of SCAP for under-five children. A total of 539 under-five children admitted with SCAP to the hospitals were studied, and the incidence of poor outcomes of SCAP was 28%. We predicted poor outcomes of SCAP using the prognostic determinants that remained in our reduced model (vaccination status, fever, pallor, unable to breastfeed, impaired consciousness, vomiting on admission, abnormal CBC, and being admitted to ICU).

PLOS ONE
Risk scores for poor outcomes of severe community-acquired pneumonia (SCAP) in children
The findings of this study revealed that our model had a discrimination performance of AUC 0.927 (95% confidence interval 0.90 to 0.96) with a calibration p-value of 0.132; the calibration curve lies close to the 45-degree line, showing very good agreement between the predicted and observed probabilities (Fig 1B). For internal validation, a bootstrapping technique was used to minimize overly optimistic results from the original model. We used 1000 bootstrap samples with replacement; the adjusted AUC was 0.913 (95% CI: 0.899, 0.956), and the optimism for the validated model was 0.0135. The bootstrapped model produced nearly the same β coefficients as the original model, and the calibration plot for the validated model (Fig 2) indicates that it fits well over the range of predicted probabilities. Therefore, given the limited optimism and excellent calibration, the model might perform well in a new sample in the future. Our risk score model also has very good discrimination ability, with an AUC of 0.89 (95% CI: 0.85-0.92).
To our knowledge, there are no prediction models for poor outcomes of SCAP in children less than five years of age in low and middle-income countries; however, several prognostic models are available to predict mortality in children hospitalized with SCAP.
In a study conducted by Dana W Flanders and colleagues, the area under the ROC curve for the Pneumonia Severity Index (PSI) model to predict in-hospital mortality was 0.847, with a calibration p-value of < 0.001 [26]. A prediction study on mortality in children with respiratory illness in western Kenya, using a modified Respiratory Index of Severity in Children (RISC), reported an AUC of 0.854 with an optimism coefficient of 0.002 [33]. Similarly, the area under the ROC curve in another Pneumonia Severity Index (PSI) study to predict mortality was 0.84, with a p-value < 0.0001 [34]. These studies show that the discrimination performance of tools to identify children at greater risk of death from SCAP is very good. The accuracy of our model is slightly higher than that of a study conducted in Gambia, which aimed to predict mortality due to SCAP in children; its model had an AUC of 0.88 (95% confidence interval: 0.84, 0.91), a sensitivity of 0.78, and a specificity of 0.77 [35]. Our model performance is, however, much higher than that of a study conducted in Malawi to predict in-hospital mortality in children with pneumonia, in which two models were externally validated: the Respiratory Index of Severity in Children (RISC) and the modified RISC (mRISC) scores. These models had discrimination performance of 0.72 and 0.79, respectively [36], indicating that they discriminated reasonably well among children with pneumonia.
In a study conducted in the US to predict severe outcomes of SCAP, the model had an accuracy of AUC 0.79 (0.77-0.81). This model is slightly different from ours as it strived to predict severe outcomes of SCAP using mechanical ventilation, shock, or death, while ours is aimed at poor outcomes of SCAP following admission using treatment failure, antibiotic change, prolonged hospital stay, and death during hospitalization. In addition, the earlier study used radiologic infiltrate patterns, and microbiologic data in addition to physical and clinical presentations [37].
Biomarker studies conducted to assist clinical decision-making in community-acquired pneumonia revealed that inflammatory biomarkers, including C-reactive protein (CRP), procalcitonin (PCT), and cytokines [38-40], and cardiovascular biomarkers such as N-terminal B-type natriuretic peptide (NT-proBNP), pro-arginine vasopressin (copeptin), and D-dimer can predict mortality and disease severity in CAP [41-44]. However, these biomarkers are less likely to be implemented and used in routine patient care in low and middle-income countries.
Most of the available prediction tools and risk scores for SCAP were developed to predict mortality and to identify low-risk patients suitable for ambulatory management or those requiring admission to hospital. However, identification of patients at high risk for poor outcomes by the existing scores remains suboptimal. Hence, it is essential to evaluate hospitalized patients repeatedly to account for the risk of poor outcomes in the course of the illness. The prognostic determinants that predicted poor outcomes of SCAP in our study were vaccination status, fever, pallor, unable to breastfeed, impaired consciousness, vomiting on admission, abnormal CBC, and being admitted to ICU. These variables are easy to collect at the bedside and might be used to predict an individual patient's risk of poor outcomes, supporting prompt decisions that complement the clinical judgment of health care workers.

Strength and limitations
We developed a risk stratification tool using easily obtainable prognostic determinants to assist clinical decision-making in low and middle-income settings, where the availability of imaging and laboratory tests is limited. However, our findings should be used with great caution because the model has not yet been externally validated. In addition, the prognostic outcome was not adjusted for the interventions provided.

Conclusions
We developed well-calibrated prediction models that have excellent discrimination performance for poor outcomes of SCAP among children admitted to the hospitals. Our model has 8 prognostic determinants that can be easily obtained from each patient during the diagnostic workup following admission. Therefore, our risk score tool could be used in clinical care settings to identify children at low risk of poor outcomes, who could be safely discharged with some medications, and those at high risk of poor outcomes, who require intensive management. External validation is a prerequisite and is highly recommended before our prediction tool is implemented. After external validation, our prediction score model can facilitate patient management decisions by offering individualized risk estimates that can be used alongside clinical judgment to enhance the recovery of children with SCAP.